LOAD DATA

data<-read.csv("auto.csv", header= TRUE)

The missing values are the first 22 obs, so we delete them

data <- data[-(1:22),]

FIRST ANALYSIS

Time Series plot

Now we create a time series object and plot it

Ita_ts <- ts(data$Italy, start = c(1990, 1), 
              end = c(2024, 2), frequency =12) 
plot(Ita_ts)

As we can see, there is a strong stagional component and a structural break around March 2020 (due to the Covid epidemy)

Stationarity Test

 kpss.test(Ita_ts)
## Warning in kpss.test(Ita_ts): p-value smaller than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  Ita_ts
## KPSS Level = 2.267, Truncation lag parameter = 5, p-value = 0.01
adf.test(Ita_ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ita_ts
## Dickey-Fuller = -3.3571, Lag order = 7, p-value = 0.06132
## alternative hypothesis: stationary

Even if the ADF-Test can suggest stationarity for certain values of alfa, the KPPS suggests that the series is non stationary.

Correlation plots

acf(Ita_ts) 

pacf(Ita_ts)

Ljung-Box Test for autocorrelation

Box.test(Ita_ts, lag = 12, type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  Ita_ts
## X-squared = 821.85, df = 12, p-value < 2.2e-16

So there is autocorrelation

DECOMPOSITION

Using simple decomposition function

dec<-decompose(Ita_ts)
plot(dec)

Alternative method with MA

We calculate the series trend by using a moving average smoother. The order of the smoother should be derived from the frequency of the series. The general rule of thumb is to use two sides moving average with an order of frequency / 2. For example, in the case, as the frequency is monthly or 12, we will use two sides moving average, where each side is 6 consecutive observations. This means that for calculating the smoothed value of the t observation - we will average the t observation along with the previous and following 6 observations (i.e., averaging 13 data points).

library(plotly)
## Warning: il pacchetto 'plotly' è stato creato con R versione 4.3.3
## 
## Caricamento pacchetto: 'plotly'
## Il seguente oggetto è mascherato da 'package:MASS':
## 
##     select
## Il seguente oggetto è mascherato da 'package:ggplot2':
## 
##     last_plot
## Il seguente oggetto è mascherato da 'package:stats':
## 
##     filter
## Il seguente oggetto è mascherato da 'package:graphics':
## 
##     layout
library(dplyr)
## 
## ######################### Warning from 'xts' package ##########################
## #                                                                             #
## # The dplyr lag() function breaks how base R's lag() function is supposed to  #
## # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or       #
## # source() into this session won't work correctly.                            #
## #                                                                             #
## # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
## # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop           #
## # dplyr from breaking base R's lag() function.                                #
## #                                                                             #
## # Code in packages is not affected. It's protected by R's namespace mechanism #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning.  #
## #                                                                             #
## ###############################################################################
## 
## Caricamento pacchetto: 'dplyr'
## Il seguente oggetto è mascherato da 'package:MASS':
## 
##     select
## I seguenti oggetti sono mascherati da 'package:xts':
## 
##     first, last
## I seguenti oggetti sono mascherati da 'package:plm':
## 
##     between, lag, lead
## Il seguente oggetto è mascherato da 'package:car':
## 
##     recode
## I seguenti oggetti sono mascherati da 'package:stats':
## 
##     filter, lag
## I seguenti oggetti sono mascherati da 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
## Warning: il pacchetto 'lubridate' è stato creato con R versione 4.3.3
## 
## Caricamento pacchetto: 'lubridate'
## I seguenti oggetti sono mascherati da 'package:base':
## 
##     date, intersect, setdiff, union
library(TSstudio)
## Warning: il pacchetto 'TSstudio' è stato creato con R versione 4.3.3
smooth <- ts_ma(Ita_ts, n = 6,
                   separate = FALSE)

Let’s plot the series with the trend estimation:

smooth$plot %>%
  layout(legend = list(x = 0.1, y = 0.9))

Next step, we will convert the series and smoothed trend into a dataframe object with the ts_to_prophet function and merge the two series:

df <- ts_to_prophet(Ita_ts) %>% 
  select(date = ds, y) %>% 
  left_join(ts_to_prophet(smooth$ma_6) %>%
              select(date = ds, trend = y), by = "date")


head(df, 8)
##         date      y    trend
## 1 1990-01-01 310337       NA
## 2 1990-02-01 219117       NA
## 3 1990-03-01 225059       NA
## 4 1990-04-01 214460       NA
## 5 1990-05-01 232627       NA
## 6 1990-06-01 201085       NA
## 7 1990-07-01 206843 200193.7
## 8 1990-08-01 107075 192300.4

Note: the cost of using the moving average for trend estimation is lost of the first and last n observation. Where n is the order of the moving average, in this case, is the first and last 6 observations.

Next, we will remove the trend from the series by subtracting the trend estimation from the series:

df$detrend <- df$y - df$trend

head(df, 8)
##         date      y    trend    detrend
## 1 1990-01-01 310337       NA         NA
## 2 1990-02-01 219117       NA         NA
## 3 1990-03-01 225059       NA         NA
## 4 1990-04-01 214460       NA         NA
## 5 1990-05-01 232627       NA         NA
## 6 1990-06-01 201085       NA         NA
## 7 1990-07-01 206843 200193.7   6649.308
## 8 1990-08-01 107075 192300.4 -85225.385

And plot

ts_plot(df,
        title = "Car Registration Detrending") %>%
  layout(legend = list(x = 0.1, y = 0.9))

We can now overlap in a plot the various years:

df$year <- year(df$date)
df$month <- month(df$date)
p <- plot_ly()
for(i in unique(df$year)){
  temp <- NULL
  temp <- df %>% filter(year == i) 
  p <- p %>% add_lines(x = temp$month,
                       y = temp$detrend,
                       name = i)
  
}

p

Now, let’s calculate the seasonal component and add it to the seasonal plot above:

seasonal_comp <- df %>% 
  group_by(month) %>%
  summarise(month_avg = mean(detrend, na.rm = TRUE),
            .groups = "drop")
  

p %>% add_lines(x = seasonal_comp$month, 
                y = seasonal_comp$month_avg,
                line = list(color = "black", dash = "dash", width = 4),
                name = "Seasonal Component")

To calculate the irregular component, we will have to merge the seasonal component back to then to subtract from the series the estimated trend and seasonal components:

df <- df %>% left_join(seasonal_comp, by = "month")


df$irregular <- df$y - df$trend - df$month_avg
head(df)
##         date      y trend detrend year month month_avg irregular
## 1 1990-01-01 310337    NA      NA 1990     1 32983.627        NA
## 2 1990-02-01 219117    NA      NA 1990     2 23201.014        NA
## 3 1990-03-01 225059    NA      NA 1990     3 38937.100        NA
## 4 1990-04-01 214460    NA      NA 1990     4  7643.473        NA
## 5 1990-05-01 232627    NA      NA 1990     5 21652.772        NA
## 6 1990-06-01 201085    NA      NA 1990     6 19034.531        NA
ts_plot(df[, c("date", "y" ,"trend", "detrend", "month_avg", "irregular")], 
        title = "Car Registration and its Components",
        type = "multiple")

FORECASTING

Naive model 1

Naive model using the mean of the model

avg_model <- Arima(Ita_ts, c(0,0,0))
avg_forecast <- forecast(avg_model)
avg_forecast
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Mar 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Apr 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## May 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Jun 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Jul 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Aug 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Sep 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Oct 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Nov 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Dec 2024       163158.3 99609.21 226707.4 65968.35 260348.3
## Jan 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Feb 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Mar 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Apr 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## May 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Jun 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Jul 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Aug 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Sep 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Oct 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Nov 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Dec 2025       163158.3 99609.21 226707.4 65968.35 260348.3
## Jan 2026       163158.3 99609.21 226707.4 65968.35 260348.3
## Feb 2026       163158.3 99609.21 226707.4 65968.35 260348.3
plot(avg_forecast)

avg_forecast$mean
##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 2024                   163158.3 163158.3 163158.3 163158.3 163158.3 163158.3
## 2025 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3
## 2026 163158.3 163158.3                                                      
##           Sep      Oct      Nov      Dec
## 2024 163158.3 163158.3 163158.3 163158.3
## 2025 163158.3 163158.3 163158.3 163158.3
## 2026

Naive model 2

Naive model using a RW

RW<-naive(Ita_ts,50)
RW
##          Point Forecast       Lo 80    Hi 80       Lo 95    Hi 95
## Mar 2024         147094   83319.434 210868.6   49559.218 244628.8
## Apr 2024         147094   56903.144 237284.9    9158.989 285029.0
## May 2024         147094   36633.212 257554.8  -21841.197 316029.2
## Jun 2024         147094   19544.868 274643.1  -47975.563 342163.6
## Jul 2024         147094    4489.736 289698.3  -71000.402 365188.4
## Aug 2024         147094   -9121.145 303309.1  -91816.447 386004.4
## Sep 2024         147094  -21637.641 315825.6 -110958.776 405146.8
## Oct 2024         147094  -33287.712 327475.7 -128776.022 422964.0
## Nov 2024         147094  -44229.697 338417.7 -145510.345 439698.3
## Dec 2024         147094  -54578.885 348766.9 -161338.061 455526.1
## Jan 2025         147094  -64422.306 358610.3 -176392.275 470580.3
## Feb 2025         147094  -73827.576 368015.6 -190776.394 484964.4
## Mar 2025         147094  -82848.467 377036.5 -204572.656 498760.7
## Apr 2025         147094  -91528.575 385716.6 -217847.736 512035.7
## May 2025         147094  -99903.831 394091.8 -230656.585 524844.6
## Jun 2025         147094 -108004.263 402192.3 -243045.126 537233.1
## Jul 2025         147094 -115855.271 410043.3 -255052.207 549240.2
## Aug 2025         147094 -123478.568 417666.6 -266711.033 560899.0
## Sep 2025         147094 -130892.887 425080.9 -278050.256 572238.3
## Oct 2025         147094 -138114.529 432302.5 -289094.804 583282.8
## Nov 2025         147094 -145157.775 439345.8 -299866.520 594054.5
## Dec 2025         147094 -152035.228 446223.2 -310384.677 604572.7
## Jan 2026         147094 -158758.073 452946.1 -320666.380 614854.4
## Feb 2026         147094 -165336.289 459524.3 -330726.894 624914.9
## Mar 2026         147094 -171778.829 465966.8 -340579.908 634767.9
## Apr 2026         147094 -178093.755 472281.8 -350237.755 644425.8
## May 2026         147094 -184288.364 478476.4 -359711.592 653899.6
## Jun 2026         147094 -190369.282 484557.3 -369011.553 663199.6
## Jul 2026         147094 -196342.547 490530.5 -378146.873 672334.9
## Aug 2026         147094 -202213.683 496401.7 -387126.000 681314.0
## Sep 2026         147094 -207987.755 502175.8 -395956.681 690144.7
## Oct 2026         147094 -213669.423 507857.4 -404646.044 698834.0
## Nov 2026         147094 -219262.988 513451.0 -413200.663 707388.7
## Dec 2026         147094 -224772.425 518960.4 -421626.620 715814.6
## Jan 2027         147094 -230201.419 524389.4 -429929.550 724117.5
## Feb 2027         147094 -235553.395 529741.4 -438114.690 732302.7
## Mar 2027         147094 -240831.539 535019.5 -446186.915 740374.9
## Apr 2027         147094 -246038.826 540226.8 -454150.773 748338.8
## May 2027         147094 -251178.036 545366.0 -462010.516 756198.5
## Jun 2027         147094 -256251.769 550439.8 -469770.122 763958.1
## Jul 2027         147094 -261262.468 555450.5 -477433.324 771621.3
## Aug 2027         147094 -266212.424 560400.4 -485003.629 779191.6
## Sep 2027         147094 -271103.794 565291.8 -492484.334 786672.3
## Oct 2027         147094 -275938.612 570126.6 -499878.549 794066.5
## Nov 2027         147094 -280718.793 574906.8 -507189.205 801377.2
## Dec 2027         147094 -285446.150 579634.1 -514419.074 808607.1
## Jan 2028         147094 -290122.395 584310.4 -521570.774 815758.8
## Feb 2028         147094 -294749.153 588937.2 -528646.789 822834.8
## Mar 2028         147094 -299327.960 593516.0 -535649.471 829837.5
## Apr 2028         147094 -303860.279 598048.3 -542581.055 836769.1
plot(RW)

Naive model 3

Naive model using RW with drift

RWD<-rwf(Ita_ts,50, drift = TRUE)
RWD
##          Point Forecast       Lo 80    Hi 80       Lo 95    Hi 95
## Mar 2024       146694.9   82766.244 210623.5   48924.473 244465.3
## Apr 2024       146295.7   55776.825 236814.7    7859.002 284732.5
## May 2024       145896.6   34899.247 256894.0  -23859.212 315652.4
## Jun 2024       145497.5   17173.316 273821.7  -50757.399 341752.4
## Jul 2024       145098.4    1453.987 288742.7  -74586.754 364783.5
## Aug 2024       144699.2  -12845.222 302243.7  -96244.221 385642.7
## Sep 2024       144300.1  -26072.252 314672.5 -116261.933 404862.2
## Oct 2024       143901.0  -38453.515 326255.5 -134986.156 422788.1
## Nov 2024       143501.9  -50146.073 337149.8 -152657.094 439660.8
## Dec 2024       143102.7  -61264.131 347469.6 -169449.411 455654.9
## Jan 2025       142703.6  -71893.794 357301.0 -185494.794 470902.0
## Feb 2025       142304.5  -82101.870 366710.8 -200895.413 485504.4
## Mar 2025       141905.3  -91941.415 375752.1 -215732.413 499543.1
## Apr 2025       141506.2 -101455.392 384467.8 -230071.501 513083.9
## May 2025       141107.1 -110679.162 392893.3 -243966.753 526180.9
## Jun 2025       140708.0 -119642.234 401058.2 -257463.305 538879.2
## Jul 2025       140308.8 -128369.541 408987.2 -270599.283 551217.0
## Aug 2025       139909.7 -136882.362 416701.8 -283407.235 563226.7
## Sep 2025       139510.6 -145199.038 424220.2 -295915.207 574936.4
## Oct 2025       139111.5 -153335.498 431558.4 -308147.564 586370.5
## Nov 2025       138712.3 -161305.686 438730.3 -320125.630 597550.3
## Dec 2025       138313.2 -169121.887 445748.3 -331868.193 608494.6
## Jan 2026       137914.1 -176794.988 452623.1 -343391.903 619220.1
## Feb 2026       137514.9 -184334.692 459364.6 -354711.600 629741.5
## Mar 2026       137115.8 -191749.688 465981.3 -365840.573 640072.2
## Apr 2026       136716.7 -199047.794 472481.2 -376790.778 650224.2
## May 2026       136317.6 -206236.072 478871.2 -387573.016 660208.2
## Jun 2026       135918.4 -213320.928 485157.8 -398197.082 670034.0
## Jul 2026       135519.3 -220308.191 491346.8 -408671.893 679710.5
## Aug 2026       135120.2 -227203.184 497443.6 -419005.590 689246.0
## Sep 2026       134721.1 -234010.785 503452.9 -429205.632 698647.7
## Oct 2026       134321.9 -240735.473 509379.3 -439278.869 707922.7
## Nov 2026       133922.8 -247381.374 515227.0 -449231.613 717077.2
## Dec 2026       133523.7 -253952.299 520999.7 -459069.691 726117.0
## Jan 2027       133124.6 -260451.775 526700.9 -468798.497 735047.6
## Feb 2027       132725.4 -266883.075 532333.9 -478423.036 743873.9
## Mar 2027       132326.3 -273249.240 537901.8 -487947.960 752600.6
## Apr 2027       131927.2 -279553.104 543407.4 -497377.603 761231.9
## May 2027       131528.0 -285797.312 548853.4 -506716.010 769772.1
## Jun 2027       131128.9 -291984.336 554242.2 -515966.961 778224.8
## Jul 2027       130729.8 -298116.490 559576.1 -525133.997 786593.6
## Aug 2027       130330.7 -304195.948 564857.3 -534220.440 794881.8
## Sep 2027       129931.5 -310224.749 570087.8 -543229.410 803092.5
## Oct 2027       129532.4 -316204.813 575269.6 -552163.843 811228.7
## Nov 2027       129133.3 -322137.947 580404.5 -561026.504 819293.1
## Dec 2027       128734.2 -328025.858 585494.2 -569820.002 827288.3
## Jan 2028       128335.0 -333870.157 590540.2 -578546.801 835216.8
## Feb 2028       127935.9 -339672.367 595544.2 -587209.230 843081.0
## Mar 2028       127536.8 -345433.931 600507.5 -595809.496 850883.0
## Apr 2028       127137.6 -351156.216 605431.5 -604349.690 858625.0
plot(RWD)

Exponential Smoothing

fit <- ets(Ita_ts)
forecast<-forecast(fit)
forecast
##          Point Forecast     Lo 80     Hi 80     Lo 95    Hi 95
## Mar 2024      159519.97 133226.90 185813.03 119308.19 199731.7
## Apr 2024      126371.71 104783.25 147960.17  93355.01 159388.4
## May 2024      156823.30 129121.63 184524.97 114457.25 199189.3
## Jun 2024      156618.45 128071.02 185165.88 112958.92 200278.0
## Jul 2024      135863.34 110356.00 161370.67  96853.24 174873.4
## Aug 2024       80259.44  64764.32  95754.56  56561.70 103957.2
## Sep 2024      134571.92 107893.49 161250.35  93770.78 175373.1
## Oct 2024      136761.20 108957.07 164565.34  94238.45 179284.0
## Nov 2024      133740.30 105889.55 161591.05  91146.25 176334.3
## Dec 2024      111937.26  88086.01 135788.51  75459.92 148414.6
## Jan 2025      141610.32 110766.54 172454.09  94438.84 188781.8
## Feb 2025      147309.39 114541.48 180077.29  97195.20 197423.6
## Mar 2025      159650.56 122132.39 197168.73 102271.47 217029.7
## Apr 2025      126475.17  96206.62 156743.71  80183.42 172766.9
## May 2025      156951.68 118722.24 195181.13  98484.79 215418.6
## Jun 2025      156746.67 117911.53 195581.80  97353.46 216139.9
## Jul 2025      135974.56 101725.91 170223.22  83595.76 188353.4
## Aug 2025       80325.15  59767.35 100882.94  48884.72 111765.6
## Sep 2025      134682.09  99674.11 169690.06  81142.01 188222.2
## Oct 2025      136873.16 100756.05 172990.27  81636.81 192109.5
## Nov 2025      133849.79  98009.92 169689.65  79037.45 188662.1
## Dec 2025      112028.90  81601.88 142455.92  65494.79 158563.0
## Jan 2026      141726.25 102696.44 180756.05  82035.31 201417.2
## Feb 2026      147429.98 106277.74 188582.22  84493.06 210366.9
plot(forecast)

summary(fit)
## ETS(M,N,M) 
## 
## Call:
##  ets(y = Ita_ts) 
## 
##   Smoothing parameters:
##     alpha = 0.2702 
##     gamma = 0.1832 
## 
##   Initial states:
##     l = 187481.9588 
##     s = 0.6401 0.8632 0.9606 0.826 0.5217 1.1355
##            1.0948 1.1422 1.1025 1.1927 1.085 1.4357
## 
##   sigma:  0.1286
## 
##      AIC     AICc      BIC 
## 10607.99 10609.21 10668.23 
## 
## Training set error measures:
##                     ME     RMSE      MAE       MPE    MAPE      MASE     ACF1
## Training set -1158.965 20603.44 13158.38 -9.244998 16.1703 0.6146821 0.228454

Exponential Smoothing Holt and Winters

s.exp1          <- HoltWinters(Ita_ts, alpha = 0.1, beta=FALSE, gamma=FALSE, l.start=23.56)
s.exp7          <- HoltWinters(Ita_ts, alpha = 0.7, beta=FALSE, gamma=FALSE, l.start=23.56)
s.exp.estimated <- HoltWinters(Ita_ts)

s.exp.estimated$alpha 
##     alpha 
## 0.4469785
forecast(s.exp.estimated)
##          Point Forecast     Lo 80    Hi 80      Lo 95    Hi 95
## Mar 2024      157737.23 130845.81 184628.6 116610.360 198864.1
## Apr 2024      126985.53  97530.04 156441.0  81937.248 172033.8
## May 2024      157404.37 125590.80 189217.9 108749.714 206059.0
## Jun 2024      157925.04 123916.50 191933.6 105913.461 209936.6
## Jul 2024      137265.82 101195.64 173336.0  82101.232 192430.4
## Aug 2024       87520.53  49500.33 125540.7  29373.644 145667.4
## Sep 2024      134439.86  94564.88 174314.8  73456.345 195423.4
## Oct 2024      135350.49  93703.27 176997.7  71656.563 199044.4
## Nov 2024      132305.89  88958.82 175653.0  66012.263 198599.5
## Dec 2024      109167.73  64184.99 154150.5  40372.573 177962.9
## Jan 2025      135724.49  89163.53 182285.5  64515.637 206933.4
## Feb 2025      141919.83  93832.40 190007.3  68376.448 215463.2
## Mar 2025      152633.44 100908.05 204358.8  73526.290 231740.6
## Apr 2025      121881.74  68778.13 174985.3  40666.782 203096.7
## May 2025      152300.58  97853.63 206747.5  69031.156 235570.0
## Jun 2025      152821.25  97063.31 208579.2  67546.834 238095.7
## Jul 2025      132162.03  75123.23 189200.8  44928.704 219395.4
## Aug 2025       82416.74  24125.20 140708.3  -6732.470 171566.0
## Sep 2025      129336.07  69818.16 188854.0  38311.289 220360.8
## Oct 2025      130246.70  69527.20 190966.2  37384.234 223109.2
## Nov 2025      127202.10  65304.32 189099.9  32537.607 221866.6
## Dec 2025      104063.94  41009.89 167118.0   7631.087 200496.8
## Jan 2026      130620.71  66431.21 194810.2  32451.347 228790.1
## Feb 2026      136816.05  71510.85 202121.2  36940.366 236691.7
plot(s.exp.estimated)

s.exp.estimated$SSE
## [1] 174953364109

ARIMA

As a rule of thumb, the ets models are not stationary, and are used mostly when there is a trend and/or seasonality in the data, as this model explicitly these components.
On the other hand, ARIMA models should be transformed in stationary and should be used when if you see autocorrelation in the data, i.e. the past data explains the present data well

SARIMA fit

miarma=auto.arima(Ita_ts)
summary(miarma)
## Series: Ita_ts 
## ARIMA(1,0,1)(0,1,2)[12] 
## 
## Coefficients:
##          ar1      ma1     sma1     sma2
##       0.9086  -0.4370  -0.6367  -0.0959
## s.e.  0.0294   0.0639   0.0525   0.0468
## 
## sigma^2 = 430785910:  log likelihood = -4523.53
## AIC=9057.05   AICc=9057.21   BIC=9076.99
## 
## Training set error measures:
##                     ME     RMSE     MAE       MPE     MAPE     MASE
## Training set -1040.454 20346.37 13359.8 -6.521588 14.34121 0.624091
##                      ACF1
## Training set -0.000608621
checkresiduals(miarma)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,1)(0,1,2)[12]
## Q* = 54.639, df = 20, p-value = 4.648e-05
## 
## Model df: 4.   Total lags used: 24

We have a MAPE of 14.5 %, not bad, but hopefully we can do better.

Test of normality for the residuals

shapiro.test(miarma$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  miarma$residuals
## W = 0.89232, p-value < 2.2e-16

We reject the hypotesis of normality of the residuals, this can generate problems in the forecast.

Plot the residuals to check for autocorrelation

plot(miarma$residuals)

acf(miarma$residuals)

No autocorrelation

Let’s make forecast

l.forecast <- forecast(miarma)
plot(l.forecast)

SARIMA model with log transformation

miarma2=auto.arima(log(Ita_ts))
summary(miarma2)
## Series: log(Ita_ts) 
## ARIMA(1,0,2)(0,1,1)[12] 
## 
## Coefficients:
##          ar1      ma1      ma2     sma1
##       0.9572  -0.3204  -0.4168  -0.8365
## s.e.  0.0272   0.0586   0.0593   0.0287
## 
## sigma^2 = 0.04467:  log likelihood = 48.81
## AIC=-87.61   AICc=-87.46   BIC=-67.68
## 
## Training set error measures:
##                        ME      RMSE       MAE        MPE      MAPE      MASE
## Training set -0.009564315 0.2071882 0.1012619 -0.1103667 0.8768409 0.6658212
##                     ACF1
## Training set 0.005235695
checkresiduals(miarma2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,2)(0,1,1)[12]
## Q* = 23.183, df = 20, p-value = 0.2799
## 
## Model df: 4.   Total lags used: 24

Test of normality for the residuals

shapiro.test(miarma2$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  miarma2$residuals
## W = 0.57884, p-value < 2.2e-16

Plot the residuals to check for autocorrelation

plot(miarma2$residuals)

acf(miarma2$residuals)

Let’s make forecast

l.forecast2 <- forecast(miarma2)
plot(l.forecast2)

Try to create a VAR model using all of the nations

data2 <- data[-(1:12),]
data_ts<- ts(data2[,c(6,7,11,12,14,19)], start = c(1990, 1), 
        end = c(2024, 2), frequency =12) 
sum(is.na(data_ts))
## [1] 0
plot(data_ts)

Var_model <- VAR(data_ts, lag.max =12)

Chosen order:

Var_model$p
## AIC(n) 
##     12

Each model is like an lm object with its own coefficients and residuals:

residuals(Var_model$varresult$Italy)
##             1             2             3             4             5 
##   14465.23012  -23298.57929   15366.43305   48270.73924  -12469.59035 
##             6             7             8             9            10 
##   25378.30384   11861.83902  -57154.36630   -6692.73930   10199.88264 
##            11            12            13            14            15 
##    8455.04746   27565.93716  -13695.89612  -23261.69942  -34964.18503 
##            16            17            18            19            20 
##  -27743.64909  -15454.94667  -33938.28053  -22940.03040   -6828.76389 
##            21            22            23            24            25 
##   -3970.65990  -20121.02044   -4780.27699  -11255.02194   18146.41698 
##            26            27            28            29            30 
##  -25576.06949   12216.03294  -23039.52080   11603.71323   -2401.24375 
##            31            32            33            34            35 
##  -16175.54538   11215.03399  -10015.98924   15397.61866    7533.18325 
##            36            37            38            39            40 
##   -2107.44290   18225.12656  -24643.44477    4379.22263   -9615.34337 
##            41            42            43            44            45 
##   18549.76311    4129.15175  -12620.80415    8488.74366  -11615.27879 
##            46            47            48            49            50 
##   -7962.00573   33763.45425   -3544.00682    4560.04699  -11461.19826 
##            51            52            53            54            55 
##    1115.16090   -2456.79541    1070.44404    4107.51233  -22772.95711 
##            56            57            58            59            60 
##    2871.50126    4218.20340   15068.39864   -3016.12502    -141.60103 
##            61            62            63            64            65 
##   -1530.01172   26602.97987   16651.00207   26085.92403   31873.23750 
##            66            67            68            69            70 
##   14216.36736   30429.52304  -24506.02082   57843.92940   34107.49615 
##            71            72            73            74            75 
##   13943.01319    -715.32542   33323.47960    6284.95900  -17118.91237 
##            76            77            78            79            80 
##   16350.98984   -5567.25906  -11653.44449   35045.11080  -12401.93070 
##            81            82            83            84            85 
##  -13082.57243  -13333.77837    8869.82017   -3909.88376  -11190.78532 
##            86            87            88            89            90 
##   18516.77324   24844.25367   -4825.78514    5931.86453    8934.23552 
##            91            92            93            94            95 
##   -4550.28599  -12263.06893  -17226.40709    1869.71977    4748.93793 
##            96            97            98            99           100 
##  -26916.23191   58068.26619   37504.68814   12008.46797  -10485.72070 
##           101           102           103           104           105 
##   -1865.39476  -31573.42991   16382.35622   25575.62359   -8663.75120 
##           106           107           108           109           110 
##   -6290.05140  -21286.37488     281.13721   53126.87033    2766.36587 
##           111           112           113           114           115 
##   17411.07105    1420.43488   -8718.85341    2409.47475  -15317.23852 
##           116           117           118           119           120 
##   -6171.86210  -25245.88683   27456.05929   -3369.20137  -11630.11545 
##           121           122           123           124           125 
##   37069.33744  -27880.04464  -15620.96824  -10865.86010   -1505.84720 
##           126           127           128           129           130 
##   -7251.29637   -4006.97372   -1859.23265   -2307.54412    6988.43567 
##           131           132           133           134           135 
##    7986.44158   55801.51506  -35352.35439    8739.51063   40325.71962 
##           136           137           138           139           140 
##  -49708.85532  -12454.82025    9307.20757   39083.62390  -25381.42561 
##           141           142           143           144           145 
##   25073.93121   20841.95623  -13865.46087  -14329.51298   22720.93823 
##           146           147           148           149           150 
##   19243.04749   -7079.86529     888.47921    -513.82435  -10663.32265 
##           151           152           153           154           155 
##   -1981.15622  -18508.04329    7537.99416  -15705.48308    3221.23815 
##           156           157           158           159           160 
##    3809.21178   -7348.39590    9258.75889  -17525.35426    7460.25279 
##           161           162           163           164           165 
##  -52354.26961   53638.64585    6172.87216    1876.50989    6415.98240 
##           166           167           168           169           170 
##  -18510.69213   19327.40714  -20382.03669   37684.40101    5298.30737 
##           171           172           173           174           175 
##    2151.48012  -19983.21555   27515.44326  -11250.56492  -24136.81664 
##           176           177           178           179           180 
##   17180.09187   -6172.28318   15499.05211    7005.86087  -10016.58883 
##           181           182           183           184           185 
##   38876.16493   -1750.11897   -6512.43955   10416.33663   16932.57279 
##           186           187           188           189           190 
##  -12655.59547  -11199.14793    8082.92064   -8946.07056   20181.00617 
##           191           192           193           194           195 
##    8208.74670   10950.18152    -729.94594   -4090.46239  -29381.17940 
##           196           197           198           199           200 
##   15265.80052  -11207.99484  -24271.28923    4948.88694    5090.69665 
##           201           202           203           204           205 
##     104.26908  -44734.05557  -10658.87387   16114.44678  -72653.95689 
##           206           207           208           209           210 
##    4363.56473    2474.33756    8889.85839   -5632.67846    2283.71091 
##           211           212           213           214           215 
##   12149.61311  -33290.47547   17015.54787   22227.61780   12043.61300 
##           216           217           218           219           220 
##   -5528.52372   -6937.69101   18670.31345   18435.97049  -39517.11310 
##           221           222           223           224           225 
##  -26719.17915   -6195.36606   -8671.54938    8127.42620    6927.21362 
##           226           227           228           229           230 
##   -5503.74148    7229.60047   -2690.96038  -17320.57689   -2136.36381 
##           231           232           233           234           235 
##   -3160.05302   16512.85034   -7566.62836   -6614.84395   -9418.68955 
##           236           237           238           239           240 
##   -1257.29703    6967.89859   -7222.92312    1274.26984   -8110.35400 
##           241           242           243           244           245 
##  -18584.61591   -8296.75620  -23253.95100   17059.10204     211.69738 
##           246           247           248           249           250 
##  -14163.31467  -16187.55758  -13574.21051  -19455.70778   12327.32145 
##           251           252           253           254           255 
##  -56477.64780  -14843.92534   -3172.28067  -20834.26169  -13465.80844 
##           256           257           258           259           260 
##   15456.49259    5333.06237   -6475.52370   -9340.25265   -3161.21857 
##           261           262           263           264           265 
##    2576.43531   -2095.11680   -4286.38272   -7289.66959  -14900.73550 
##           266           267           268           269           270 
##   -2360.89663     772.37779    7446.11078   -2313.82287   -2334.37774 
##           271           272           273           274           275 
##   -2437.07270  -10839.47838   -5611.86570   -1456.59583   -3252.86260 
##           276           277           278           279           280 
##   -5496.22719   -8316.71468    5045.42356    8830.44440   16145.44974 
##           281           282           283           284           285 
##   -3760.95462     -26.42791    6233.24086  -21565.12575    7048.53906 
##           286           287           288           289           290 
##   -1773.11070   12845.37005  -11805.12682    3341.29585   33201.90067 
##           291           292           293           294           295 
##    7886.18394   11501.57317    9604.94370   -2278.59498  -12501.87219 
##           296           297           298           299           300 
##  -18086.47418   24815.99227    1913.27795    -533.75799  -11366.54551 
##           301           302           303           304           305 
##    4517.00621   24071.11152   26701.71962  -16344.06040    7853.96795 
##           306           307           308           309           310 
##    4715.67279  -21061.79336  -17606.40782    8555.58234    9840.27451 
##           311           312           313           314           315 
##   -5357.45033  -18080.76038    9281.08662    7983.78928   18092.63087 
##           316           317           318           319           320 
##    9255.07136    -613.23540    -498.79603  -31261.37040   -6713.55897 
##           321           322           323           324           325 
##   19144.23595   15055.55014  -32011.34399  -12669.77883  -21382.78683 
##           326           327           328           329           330 
##   22412.48887    6896.67715    3282.29425   13742.80559   -2647.71032 
##           331           332           333           334           335 
##  -10345.15372  -28751.64789   -3912.15970   10855.81820   -9376.81540 
##           336           337           338           339           340 
##   11471.75290  -31104.18548  -16106.71982 -118791.09917  -79535.54967 
##           341           342           343           344           345 
##    -332.15172   13579.52574   31285.15531  -13950.86367   18920.15158 
##           346           347           348           349           350 
##   12125.24577   26788.52670   39163.13996   -7147.95540  -17680.52127 
##           351           352           353           354           355 
##   10243.05846   20224.09304    2195.70971   11312.68927   -5994.66638 
##           356           357           358           359           360 
##  -14872.16731   -5034.14286  -20707.69491    2038.65231  -14084.53968 
##           361           362           363           364           365 
##  -21634.44694  -11986.85619  -23832.67048   -6654.49920   10464.75674 
##           366           367           368           369           370 
##    2324.46452    1282.82534  -15729.99099   -4679.67323   10487.13877 
##           371           372           373           374           375 
##   11879.55396    9090.02012   36295.30831    2019.06342    6075.44825 
##           376           377           378           379           380 
##    6653.37382    6398.62996  -11137.13833   -4135.92620    7258.20662 
##           381           382           383           384           385 
##   -1781.56185   16815.91598    5470.61890   -1912.31648   16326.80458 
##           386           387           388           389           390 
##   -2300.88806  130480.72777    7291.50875   15127.16877  -27862.85749 
##           391           392           393           394           395 
##    1761.03518   59798.47937   21966.50101  -30145.94640  -21720.66222 
##           396           397           398 
##   26631.29985   -1689.64735   -2984.31158

Predictions

predict(Var_model)
## $Finland
##            fcst     lower     upper       CI
##  [1,] 10736.157 7685.0269 13787.286 3051.130
##  [2,]  7093.412 4003.0533 10183.770 3090.359
##  [3,]  6607.925 3375.1330  9840.716 3232.792
##  [4,]  7610.973 4221.8312 11000.116 3389.142
##  [5,]  8460.372 5004.7314 11916.013 3455.641
##  [6,]  5390.049 1879.8890  8900.209 3510.160
##  [7,]  4161.065  584.6641  7737.466 3576.401
##  [8,]  5665.090 2055.6534  9274.526 3609.436
##  [9,]  6019.652 2352.2390  9687.065 3667.413
## [10,]  4930.802 1205.3865  8656.218 3725.416
## 
## $Sweden
##            fcst      lower    upper       CI
##  [1,] 11781.221  4705.5096 18856.93 7075.711
##  [2,] 15393.869  8032.5651 22755.17 7361.304
##  [3,] 18768.058 11024.4384 26511.68 7743.620
##  [4,] 19388.452 11304.7219 27472.18 8083.731
##  [5,] 15723.355  7548.0415 23898.67 8175.314
##  [6,] 17921.210  9599.9347 26242.48 8321.275
##  [7,] 13677.827  5031.7457 22323.91 8646.081
##  [8,]  9548.097   733.8834 18362.31 8814.213
##  [9,] 13387.102  4402.9062 22371.30 8984.196
## [10,] 15615.606  6346.4259 24884.79 9269.180
## 
## $Italy
##           fcst     lower    upper       CI
##  [1,] 269866.8 224805.17 314928.4 45061.60
##  [2,] 223766.0 173797.55 273734.4 49968.41
##  [3,] 213906.9 160446.12 267367.7 53460.77
##  [4,] 206375.2 149892.31 262858.1 56482.91
##  [5,] 206859.9 146491.99 267227.8 60367.91
##  [6,] 188496.1 127518.46 249473.6 60977.59
##  [7,] 184166.7 122347.90 245985.4 61818.77
##  [8,] 101878.2  38708.09 165048.3 63170.13
##  [9,] 153282.3  88783.99 217780.6 64498.31
## [10,] 182972.4 117749.19 248195.6 65223.22
## 
## $Gerany
##           fcst    lower    upper       CI
##  [1,] 282595.4 223431.5 341759.2 59163.82
##  [2,] 269821.6 203438.4 336204.8 66383.21
##  [3,] 364669.8 294270.3 435069.2 70399.42
##  [4,] 371330.6 297413.2 445248.0 73917.40
##  [5,] 329319.2 251344.4 407293.9 77974.71
##  [6,] 357624.4 277169.7 438079.2 80454.75
##  [7,] 350464.5 267647.9 433281.1 82816.55
##  [8,] 230799.3 145771.5 315827.1 85027.80
##  [9,] 264384.0 178789.2 349978.8 85594.81
## [10,] 277144.4 191174.4 363114.5 85970.07
## 
## $Norway
##           fcst     lower     upper       CI
##  [1,] 5303.020  703.0957  9902.945 4599.925
##  [2,] 6860.692 2126.7441 11594.639 4733.947
##  [3,] 7891.515 3058.4923 12724.538 4833.023
##  [4,] 7727.703 2622.9608 12832.445 5104.742
##  [5,] 9261.967 4057.7652 14466.170 5204.202
##  [6,] 8154.561 2864.5390 13444.582 5290.022
##  [7,] 7764.193 2145.0083 13383.379 5619.185
##  [8,] 7696.153 2007.5523 13384.753 5688.600
##  [9,] 8288.532 2515.7485 14061.316 5772.784
## [10,] 7572.705 1625.5735 13519.837 5947.132
## 
## $UK
##            fcst     lower    upper       CI
##  [1,] 155775.01  77632.63 233917.4 78142.38
##  [2,] 103776.83  24326.75 183226.9 79450.08
##  [3,] 157615.50  75949.20 239281.8 81666.29
##  [4,] 157845.92  75653.31 240038.5 82192.60
##  [5,] 103564.24  21174.71 185953.8 82389.53
##  [6,]  94640.71  11858.75 177422.7 82781.96
##  [7,]  36202.67 -49034.46 121439.8 85237.13
##  [8,] 319697.05 233799.93 405594.2 85897.11
##  [9,] 124432.47  37439.92 211425.0 86992.54
## [10,] 125317.53  38156.14 212478.9 87161.38

The predictions are not that good, we were expecting that: the residuals are high and coefficients low. So: The countries cannot explain each others’ behavior.

#ARIMAX

#Import time series about gas prices 
carburanti<-read.csv("Carburanti.csv", header= TRUE)
Benzina_ts <- ts(carburanti$Benzina, start = c(1996, 1), 
             end = c(2024, 2), frequency =12) 
plot(Benzina_ts)

plot(diff(Benzina_ts))

#To use ARIMAX we need both series to be stationary 
adf.test(Ita_ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ita_ts
## Dickey-Fuller = -3.3571, Lag order = 7, p-value = 0.06132
## alternative hypothesis: stationary
adf.test(Benzina_ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Benzina_ts
## Dickey-Fuller = -2.4064, Lag order = 6, p-value = 0.4052
## alternative hypothesis: stationary
#Create the same window to have the series at the same lenght
Ita_ts_modify <- window(Ita_ts, start=c(1996,1))

fit2 <- auto.arima(Ita_ts_modify, xreg=Benzina_ts)
summary(fit2)
## Series: Ita_ts_modify 
## Regression with ARIMA(1,0,1)(1,1,2)[12] errors 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1     sma2      drift     xreg
##       0.8777  -0.3683  0.0729  -0.7431  -0.0149  -163.5631  -1.5490
## s.e.  0.0435   0.0791  0.3789   0.3747   0.2715   154.3512  24.4844
## 
## sigma^2 = 425917277:  log likelihood = -3702.49
## AIC=7420.97   AICc=7421.43   BIC=7451.27
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 544.2815 20049.32 13225.33 -6.215407 14.76927 0.6096376 0.01099098
checkresiduals(fit2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Regression with ARIMA(1,0,1)(1,1,2)[12] errors
## Q* = 40.53, df = 19, p-value = 0.002786
## 
## Model df: 5.   Total lags used: 24

We try with to differentiate Gasoline price and fit it into the ARIMA model

d_Benzina_ts<-diff(Benzina_ts)
adf.test(d_Benzina_ts)
## Warning in adf.test(d_Benzina_ts): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  d_Benzina_ts
## Dickey-Fuller = -7.6558, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
#Change the window by one month (we lose an obs because we differentiate)
Ita_ts_modify2 <- window(Ita_ts, start=c(1996,2))

#Fit the model
fit <- auto.arima(Ita_ts_modify2, xreg=d_Benzina_ts)
summary(fit)
## Series: Ita_ts_modify2 
## Regression with ARIMA(1,0,1)(1,1,2)[12] errors 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1     sma2      drift     xreg
##       0.8828  -0.3799  0.0725  -0.7390  -0.0194  -162.4691  17.2047
## s.e.  0.0397   0.0777  0.3669   0.3633   0.2634   145.9534  25.0419
## 
## sigma^2 = 426540455:  log likelihood = -3691.36
## AIC=7398.71   AICc=7399.17   BIC=7428.98
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 506.6749 20062.21 13264.28 -6.231399 14.81549 0.6100593 0.01050806
checkresiduals(fit)

## 
##  Ljung-Box test
## 
## data:  Residuals from Regression with ARIMA(1,0,1)(1,1,2)[12] errors
## Q* = 40.7, df = 19, p-value = 0.002645
## 
## Model df: 5.   Total lags used: 24

We create dummy variable to handle the structural break of Covid

dates <- as.Date(time(Ita_ts_modify), origin = "1970-01-01")

# Creation of time series with a dummy variable
dummy <- ifelse(dates == as.Date("2020-03-01"), 1, 0)

dummy_ts <- ts(dummy, start = c(1996, 1), frequency = 12)

Then we merge in a unique object ts to use it as a regressor

x_ts <- cbind(dummy_ts, Benzina_ts)

fit <- auto.arima(Ita_ts_modify, xreg= x_ts)
summary(fit) 
## Series: Ita_ts_modify 
## Regression with ARIMA(1,0,1)(1,1,2)[12] errors 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1     sma2      drift    dummy_ts
##       0.9189  -0.4645  0.0162  -0.6894  -0.0432  -149.0849  -108737.98
## s.e.  0.0338   0.0720  0.5078   0.5046   0.3617   181.4630    16704.03
##       Benzina_ts
##           3.1875
## s.e.     22.0011
## 
## sigma^2 = 376135288:  log likelihood = -3681.52
## AIC=7381.04   AICc=7381.61   BIC=7415.13
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 384.3803 18811.67 12774.73 -9.269846 17.47593 0.5888668
##                     ACF1
## Training set 0.005912887

I don’t think things are better than the normal ARIMA

#let’s take again the various countries

data_ts <-window (data_ts, start=c(1990,1))
Ita_ts2 <- window (Ita_ts, start=c(1990,1))

fit <- auto.arima(Ita_ts2, xreg= data_ts)

The MAPE is reduced.